Summary

TEM Disaster Server Architecture (DSA) provides a built-in ability to install multiple TEM Servers that replicate information from each other for the purpose of disaster recovery. If one TEM Server fails, the other TEM Servers automatically take over as fully functional TEM Servers: they receive data from the TEM Relays and TEM Clients and accept TEM Console connections. When the failed TEM Server is restored, it automatically receives updated information.

 

DSA Expectations

The health of the DSA architecture depends on the health and efficiency of the database replication process facilitated by the FillDB service. If actions are successfully propagated (in the Console) and the database has successfully replicated (see the Replication tab in the TEM Admin tool), actions will run appropriately on all child endpoints.

However, if the Primary DSA server is disabled after action propagation but before successful database replication, the Secondary will NOT have the newly propagated actions. In this case, children of the Secondary are not given the version of the actionsite that contains the changes made before replication completed, so the Secondary does not receive the new actions or the downloads associated with them. The desired actions would need to be taken again from the Secondary.
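
The timing dependency described above can be illustrated with a small Python sketch. It is only a model and not part of TEM: the function name, the action names, and the timestamps are all hypothetical, and it assumes an action reaches the Secondary only if it was propagated at or before the last successful replication.

```python
from datetime import datetime

def actions_to_reissue(propagated, last_replication):
    """Return names of actions propagated after the last successful replication.

    `propagated` maps a (hypothetical) action name to the time it was
    propagated on the Primary. Anything newer than `last_replication`
    is assumed to be missing on the Secondary.
    """
    return sorted(name for name, when in propagated.items()
                  if when > last_replication)

# Hypothetical timeline: replication last succeeded at 10:05.
propagated = {
    "Patch A": datetime(2024, 1, 1, 10, 0),
    "Patch B": datetime(2024, 1, 1, 10, 7),
}
missing = actions_to_reissue(propagated, datetime(2024, 1, 1, 10, 5))
print(missing)  # ['Patch B']
```

In this model, only "Patch B" would need to be taken again from the Secondary after a failover.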

In all cases clients will continue to be provided an actionsite (containing open actions) for gathering and you can continue to manage the deployment and take new actions from the Secondary server.

The term 'high availability' refers to disaster or event recovery that is immediate, and in that sense a secondary DSA server is always immediately available for deployment management. However, the actionsite and content data that is replicated between the Primary and Secondary servers should not be considered 'high availability', because it depends on a database replication process rather than a real-time, load-balanced, concurrent data process.

 

DSA Installation Instructions

See the TEM Administrator's Guide at http://support.bigfix.com/resources.html for more information about DSA.

 

DSA Requirements

Authenticating Additional Servers (DSA)

Multiple servers can provide a higher level of service for your TEM installation. If you choose to add Disaster Server Architecture (DSA) to your TEM installation, you will be able to recover from network and systems failures automatically while continuing to provide local service. To take advantage of this functionality, you will need one or more additional servers with a capability at least equal to your primary server. Because of the extra expense and installation involved, you should carefully think through your needs before committing to DSA.

First, you must decide how you want your TEM Servers to communicate with each other. There are three inter-server authentication options: the first two are flavors of NT and the third is SQL. Because it is more secure, IBM recommends NT Authentication. You can't mix and match; all TEM Servers must use the same authentication method. Here are the instructions for each option:

Using NT Authentication with Domain Users/User Groups

When using this technique, each TEM Server uses the specified domain user or a member of the specified user group to access all other TEM Servers in the deployment. To authenticate your TEM Servers using Domain Users/User Groups, follow these steps:

  1. Create a service account user or user group in your domain. For a user group, add authorized domain users to your TEM Servers. You may need to have domain administration privileges to do this.
  2. On the Master TEM Server, use SQL Enterprise Manager (or the SQL 2005 Management Studio) to create a login for the domain service account user or user group, with a default database of BFEnterprise, and give this login System Admin (sa) authority. System Admin authority is required in order for operator accounts to be replicated due to SQL Server requirements.
  3. On the Master TEM Server, change the LogOn settings for the TEM FillDB service to the domain user or member of the user group created above, and restart the service.

 

Using NT Authentication with Domain Computer Groups

When using this technique, each TEM Server is added to a specified domain computer group and each server accepts logins from members of that domain group. To authenticate your TEM Servers using Domain Computer Groups, follow these steps:

  1. Create a Global Security Group in your domain containing each desired TEM Server. You may need to have domain administration privileges to do this.
  2. After creating the group, each server will need to be rebooted in order to update its domain credentials.
  3. On the Master TEM Server, use SQL Enterprise Manager (or the SQL 2005 Management Studio) to create a login for the domain group, with a default database of BFEnterprise, and give this login System Admin (sa) authority. System Admin authority is required in order for operator accounts to be replicated due to SQL Server requirements.

Using SQL Authentication

When using this technique, each TEM Server is given a login name and password, and is configured to accept the login names and passwords of all other TEM Servers in the deployment. Be aware that the password for this account is stored in clear-text under the HKLM branch of the registry on each TEM Server. To authenticate your TEM Servers using SQL Authentication, follow these steps:

  1. Choose a single login name (for example, 'besserverlogin'), and a single password to be used by all servers in your deployment for inter-server authentication.
  2. On the Master TEM Server, use SQL Enterprise Manager (or SQL 2005 Management Studio) to create a SQL Server login with this name. Choose SQL Server Authentication as the authentication option and specify the password. Change the default database to BFEnterprise and grant it System Admin (sa) authority. System Admin authority is required in order for operator accounts to be replicated due to SQL Server requirements.
  3. On the Master TEM Server, add the following String values under the key HKLM\Software\BigFix\Enterprise Server\FillDB:

    ReplicationUser = [login name]

    ReplicationPassword = [password]
  4. Restart the BigFix FillDB service.
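
As one way to prepare the two registry values from step 3, the following Python sketch builds the text of a .reg file. This is an illustration under stated assumptions, not a TEM tool: the function names are hypothetical, the login and password are placeholders, and saving and importing the file is left out. Because the password is stored in clear text, treat any such file with the same care as the registry key itself.

```python
def reg_escape(value):
    """Escape backslashes and double quotes for a .reg string value."""
    return value.replace("\\", "\\\\").replace('"', '\\"')

def replication_reg_file(login, password):
    """Build .reg file text that sets the two FillDB replication values."""
    key = r"HKEY_LOCAL_MACHINE\Software\BigFix\Enterprise Server\FillDB"
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        f'"ReplicationUser"="{reg_escape(login)}"',
        f'"ReplicationPassword"="{reg_escape(password)}"',
        "",
    ]
    return "\n".join(lines)

# Placeholder credentials for illustration only.
print(replication_reg_file("besserverlogin", "example-password"))
```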

Note: This choice must be made on a deployment-wide basis; you cannot mix domain-authenticated servers with SQL-authenticated servers. Also, all TEM servers in your deployment must be running the same version of SQL Server.

 

Installing Additional Servers (DSA)

Before proceeding with this section, determine your authentication method and complete the appropriate steps in the preceding Authenticating Additional Servers (DSA) section.

For each additional TEM Server you wish to add to your deployment, make sure it can communicate with the existing TEM Servers, and then follow these steps:

  1. Install the same SQL Server version being used by the Master TEM Server.
  2. Run the TEM Server installer on each machine that you wish to configure as an additional BigFix Server. You should use the same administrative account that you used for the local SQL Server install (so you have sa authority).
  3. If you're extracting the server installer from the TEM Installation Generator, select Production Deployment, and then select I want to install with an existing masthead. Specify the masthead.afxm file from the Master TEM Server. Otherwise, use the Server install package from the BESInstallers folder on the Master TEM Server.
  4. On the Select Database Replication page of the server installer, select Replicated Database.
  5. On the Select Database page, select Local Database to host the database on the server (typical for most applications).
  6. Proceed through the installer screens as usual until the installer gets to Configuring your new installation and prompts you with a Database Connection dialog box. Enter the hostname of your master server, and the credentials for an account that can log into the master server with DBO permissions on the BFEnterprise database.
  7. The Replication Servers window shows you the TEM Server configuration for your current deployment. By default, your newly installed TEM Server should be configured to replicate directly from the master server every 5 minutes. You can adjust this as necessary.
  8. Use SQL Enterprise Manager (or SQL 2005 Management Studio) to create the same SQL Server login you created earlier on the Master TEM Server with BFEnterprise as the default database and System Admin (sa) authority or the DBO role on the BFEnterprise and master databases.
  9. On the newly-installed server, run the TEM Administration Tool and select the Replication tab to see the current list of servers and their replication periods. Select the newly installed server from the pull-down menu, and verify in the list below that it is successfully connected to the master server. Then select the master server from the same pull-down menu, and verify that it is properly connected to the new server. You may need to wait for the next replication period before both servers show a successful connection.

    Note: The initial replication could take several hours depending on the size of your database. Wait for the replication to complete before taking any actions from a Console connected to the replica TEM Server.
  10. You can see a graph of the servers and their connections by clicking the Edit Replication Graph button. You can change the connections between servers by simply dragging the connecting arrows around.

Verify and Manage DSA Replication

The Replication tab of the TEM Admin tool is the only way to properly verify successful replication between Servers. The tool reports important information such as Server, Distance, Expected Latency, Last Replication Time, and Last Error Message, each of which can be used to troubleshoot issues.

If you believe you are experiencing an error, you can troubleshoot further by referring to the Filldb.log file, located by default in the following location: C:\Program Files\BigFix Enterprise\BES Server\FillDBData
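
When the log is large, a short script can help narrow it down. The Python sketch below is a hedged example only: the function name and the default keywords are assumptions, the sample lines are invented, and the real FillDB log format may differ.

```python
def find_log_lines(lines, keywords=("error", "replication")):
    """Return (line_number, text) pairs for lines mentioning any keyword.

    Matching is case-insensitive. The keywords are a guess, not a
    documented FillDB log format.
    """
    wanted = [k.lower() for k in keywords]
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(k in lowered for k in wanted):
            hits.append((number, line.rstrip("\n")))
    return hits

# Hypothetical sample lines; real log entries may look different.
sample = [
    "Startup complete\n",
    "Replication error: example message\n",
    "Processed reports\n",
]
for number, text in find_log_lines(sample):
    print(number, text)
```

To scan the real file, pass an open file object instead of the sample list, for example one opened from the default folder named above.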

Note: Please be patient as initial replication can and will take time depending upon database size and latency between Servers.

 

Uninstalling a Replication Server

If you have a Disaster Server Architecture (DSA) deployment and one of the TEM Root Servers has been removed from the deployment, you can mark it as deleted so it won't show up in TEM Admin. You can use the delete_replication_server procedure stored on the BFEnterprise database to remove a TEM Server. Be careful not to delete the wrong server, or you may lock yourself out. Here's how to proceed:

  1. Stop the TEM Root Server service to discontinue the receipt of new reports.
  2. Wait for FillDB to finish processing new reports, and then Stop the TEM FillDB service.
  3. Wait for the Master Server to run through a couple of replication cycles to clear or process remaining reports.
  4. Run the delete_replication_server procedure on Master server, within SQL Server:

A. Open the BFEnterprise database with sa rights and open a new query window. Enter the following query, which deletes a server with the name of MyRootServer:

    declare @serverid int
    select @serverid = (select ServerID from REPLICATION_SERVERS where DNS like '%MyRootServer%')
    exec delete_replication_server @serverid

B. Restart the TEM Admin tool to update it with the changes.

 

  5. Download the BESRemove utility from the BigFix site: http://support.bigfix.com/bes/install/downloadutility.html#besremove
  6. Run the utility from the Replication Server you wish to remove, then select the server and its databases for removal.



 

Depending on which authentication method was used during installation (NT Authentication or SQL Authentication), perform the following:



For NT Authentication:

 

  1. On the Master TEM Server, use SQL Enterprise Manager (or SQL Management Studio) to remove the login that was created during installation for the domain service account user, user group, or computer group. This is the login with a default database of BFEnterprise and System Admin (sa) authority or the DBO (DataBase Owner) role on the BFEnterprise and master databases.





For SQL Authentication:

 

  1. On the Master TEM Server, use SQL Enterprise Manager (or SQL Management Studio) to remove the SQL Server login that was created during installation. This is the SQL Server Authentication login with a default database of BFEnterprise and System Admin (sa) authority or the db_owner role for the BFEnterprise and master databases.
  2. On the Master TEM Server, remove the following String values under the key HKLM\Software\BigFix\Enterprise Server\FillDB:

    ReplicationUser

    ReplicationPassword

    You may need to delete the registry values from the DSA secondary server too.
  3. Remove the ODBC entries for the remote database connections on both servers.

 

Restart the TEM FillDB service on the Master server.
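
Because the delete_replication_server query earlier in this section matches the server name with a LIKE pattern, it can be worth previewing which rows match before deleting anything. The Python sketch below only builds the SQL text and does not connect to a database. The function name is hypothetical, and the preview statement is an assumption based on the ServerID and DNS columns used in the original query.

```python
def removal_statements(server_name):
    """Return (preview_sql, delete_sql) for the given server name.

    Single quotes are doubled so the name stays inside the T-SQL
    string literal.
    """
    pattern = "%" + server_name.replace("'", "''") + "%"
    preview = (
        "select ServerID, DNS from REPLICATION_SERVERS "
        f"where DNS like '{pattern}'"
    )
    delete = "\n".join([
        "declare @serverid int",
        "select @serverid = (select ServerID from REPLICATION_SERVERS "
        f"where DNS like '{pattern}')",
        "exec delete_replication_server @serverid",
    ])
    return preview, delete

preview_sql, delete_sql = removal_statements("MyRootServer")
print(preview_sql)
print(delete_sql)
```

Running the preview statement first and confirming that it returns exactly one row, for the server you intend to remove, reduces the risk of deleting the wrong server.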

 

Configuration for Relay Failover

For the failover process to occur successfully, you must set the DSA Server as the Secondary Relay for the top-level Relays, either manually with the __RelayServer2 client setting or through the Console's Computer right-click settings user interface. When the Primary TEM Server fails and lower-level Relays are unable to report, they use the Secondary Relay value during the normal relay selection process to find and report to the Secondary Server.

Note: The failover process is not immediate. It depends on the _BESClient_RelaySelect_ResistFailureIntervalSeconds setting, which is set to 10 minutes by default on top-level TEM Relays. A properly configured Relay architecture allows the entire deployment to begin failing over in about 10 minutes by default. Deployments that do not have a Relay infrastructure set up can take significantly longer.

 

Message Level Encryption and DSA

If Message Level Encryption is enabled and clients are configured using "Task: BES Client Setting: Encrypted Reports", the Server's encryption key should be moved to the Secondary DSA Server. This allows the DSA Server to process reports from encrypted clients during normal operations or in the event of an outage on the Primary Server.

 

How can I increase the replication interval for DSA?

If you are using Distributed Server Architecture (DSA) and replication is failing with the error message 'Replication was interrupted to process server database insertions.' in the BES Administration tool, you'll need to raise the maximum amount of time spent doing replication on the TEM Server that is failing.



To increase the maximum replication time, set the following registry value on the TEM Server.



[HKEY_LOCAL_MACHINE\SOFTWARE\BigFix\Enterprise Server\FillDB]

* UnInterruptableReplicationSeconds (DWORD): Seconds

Default: 30



Note: You must restart the FillDB service for changes to take effect.



Raising the value lets the TEM Server spend more time performing replication each time it attempts to do so, based on the replication interval. The error occurs because the TEM Server is unable to complete replication within the default time.



For larger deployments of TEM, try a value of 60-120 seconds. If you are installing a new TEM Server, you might raise the value to 300-600 seconds during the initial replication period to reduce the amount of time spent initializing the new TEM Server.
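
As a rough illustration of the setting described above, this Python sketch builds .reg file text for a chosen number of seconds. The function name is hypothetical and the script is not a TEM utility. It only produces the text, and the FillDB service still needs to be restarted after the value is changed.

```python
def uninterruptable_replication_reg(seconds):
    """Build .reg file text setting UnInterruptableReplicationSeconds (DWORD)."""
    if not 0 < seconds <= 0xFFFFFFFF:
        raise ValueError("seconds must fit in a positive 32-bit DWORD")
    key = r"HKEY_LOCAL_MACHINE\SOFTWARE\BigFix\Enterprise Server\FillDB"
    return "\n".join([
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        # .reg files write DWORD values as eight hexadecimal digits.
        f'"UnInterruptableReplicationSeconds"=dword:{seconds:08x}',
        "",
    ])

# 120 seconds, within the 60-120 range suggested for larger deployments.
print(uninterruptable_replication_reg(120))
```

The value appears in hexadecimal in the generated text, so 120 seconds is written as 00000078.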